Nature Methods — Latest Matching Preprints

1

improv: A flexible software platform for adaptive neuroscience experiments

Draelos, A.; Nikitchenko, M.; Sriworarat, C.; Sprague, D.; Loring, M. D.; Pnevmatikakis, E.; Giovannucci, A.; Naumann, E. A.; Pearson, J. M.

2021-02-23 neuroscience 10.1101/2021.02.22.432006 medRxiv

Top 0.1%

71.6%

Show abstract

Current neuroscience research is often limited to testing predetermined hypotheses and post hoc analysis of already collected data. Adaptive experimental designs, in which modeling drives ongoing data collection and selects experimental manipulations, offer a promising alternative. Still, tight integration between models and data collection requires coordinating diverse hardware configurations and complex computations under real-time constraints. Here, we introduce improv, a software platform that allows users to fully integrate custom modeling, analysis, and visualization with data collection and experimental control. We demonstrate both in silico and in vivo how improv enables more efficient experimental designs for discovery and validation across various model organisms and data types. Improv can orchestrate custom real-time behavioral analyses, rapid functional typing of neural responses from large populations via calcium microscopy, and optimal visual stimulus selection. We incorporate real-time machine learning methods for dimension reduction and predictive modeling of latent neural and behavioral features. Finally, we demonstrate how improv can perform model-driven interactive imaging and simultaneous optogenetic photostimulation of visually responsive neurons in the larval zebrafish brain expressing GCaMP6s and the red-shifted opsin rsChRmine. Together, these results demonstrate the power of improv to integrate modeling with data collection and experimental control to achieve next-generation adaptive experiments.

2

Recut: a Concurrent Framework for Sparse Reconstruction of Neuronal Morphology

Marrett, K.; Zhu, M.; Chi, Y.; Chen, Z.; Choi, C.; Dong, H.-W.; Park, C. S.; Cong, J.; Yang, X. W.

2022-04-08 neuroscience 10.1101/2021.12.07.471686 medRxiv

Top 0.1%

67.5%

Show abstract

Advancement in modern neuroscience is bottlenecked by neural reconstruction, a process that extracts 3D neuron morphology (typically in tree structures) from image volumes at the scale of hundreds of GBs. We introduce Recut, an automated and accelerated neural reconstruction pipeline, which provides a unified, and domain specific sparse data representation with 79x reduction in the memory footprint. Recuts reconstruction can process 111 Kneurons/day or 79 TB/day on a 24-core workstation, placing the throughput bottleneck back on microscopic imaging time. Recut allows the full brain of a mouse to be processed in memory on a single server, at 89.5x higher throughput over existing I/O-bounded methods. Recut is also the first fully parallelized end-to-end automated reconstruction pipeline for light microscopy, yielding tree morphologies closer to ground truth than the state-of-the-art while removing involved manual steps and disk I/O overheads. We also optimized pipeline stages to linear algorithmic complexity for scalability in dense settings and allow the most timing-critical stages to optionally run on accelerated hardware.

3

VesiclePy: A Machine Learning Vesicle Analysis Toolbox for Volume Electron Microscopy

Adhinarta, J. K.; Fan, Y.; Gohain, A.; Lin, M.; Nurkin, P.; Ren, R.; Roth, M.; Zhang, S.; Yakobe, A.; Yuste, R.; Wei, D.

2025-09-16 neuroscience 10.1101/2025.09.08.674799 medRxiv

Top 0.1%

66.8%

Show abstract

Vesicles are critical components of neurons that package neurotransmitters and neuropeptides for their release, in order to communicate with other neurons and cells. However, due to their small size, the reconstruction of the full vesicle endowment across an entire neuronal morphology remains challenging. To achieve this, we have used, as a tool to identify and visualize vesicles, Volume Electron Microscopy (vEM), a method that has the nanoscale resolution to detect individual vesicle boundaries, content, and 3D locations. However, the large volume of vEM datasets poses a challenge in the segmentation, classification, and spatial analysis of tens of thousands of vesicles and their target cell in 3D. Here we report the development of VesiclePy, an integrated pipeline for automated segmentation, classification, proofreading, and spatial analysis of vesicles, relative to neuron masks in large-volume electron microscopy data. Our package integrates the efficiency of deep learning and the accuracy of human proofreading and provides a streamlined package in chunked processing and accurate indexing, localization, and visualization of single vesicle resolution in large vEM data. We demonstrate the viability of VesiclePy using high-pressure frozen serial EM data of Hydra vulgaris and quantify the performance of the package using ground truth manual annotations. We show that VesiclePy can process a multiterabyte serial EM dataset, efficiently annotate 53,851 vesicles from 20 complete neurons, and classify vesicles into 5 types. Each vesicle has a unique ID and 3D location for further spatial analysis in relation to neuron or non-neuronal targets nearby. Finally, by combining vesicle data and morphological information of each neuron, we can quantitatively cluster neurons into subtypes. VesiclePy is available at https://github.com/PytorchConnectomics/VesiclePy under an MIT license.

4

UniSPAC: A Unified Segmentation Framework for Proofreading and Annotation in Connectomics

Deng, J.; Wu, J.; Chen, C.; Zheng, Q.; Zhang, Z.; Wu, J.; Ouyang, W.; Song, C.

2024-11-27 neuroscience 10.1101/2024.11.27.625336 medRxiv

Top 0.1%

66.0%

Show abstract

Reconstructing dense neuronal connections from volume electron microscopy (vEM) images is a critical challenge in neuroscience, driving the development of various automatic neuron segmentation methods1. Although current state-of-the-art automated segmentation methods can achieve high segmentation accuracy, they still require substantial manual proofreading and rely heavily on labeled datasets, which are often scarce, particularly for non-model organisms. Here, we introduce a Unified Segmentation framework for Proofreading and Annotation in Connectomics (UniSPAC) by providing the interactive segmentation model in 2D-level and the neuron tracing model in 3D-level. UniSPAC-2D allows users to correct its segmentation errors through point-based prompts, combining segmentation and proofreading in a single framework. UniSPAC-3D automatically traces neurons segmented by UniSPAC-2D across image slices, significantly reducing human involvement. Furthermore, UniSPAC-2D and UniSPAC-3D models can facilitate the semi-automatic generation of labeled data for new species, eliminating the need for external annotation tools. The fresh annotated data generated during proofreading in turn optimizes the interactive model through an online learning strategy, reducing the labeling effort for novel species over time. UniSPAC outperforms the start-of-the-art Segment Anything Model (SAM) in Drosophila segmentation, achieving 47x higher efficiency, and surpasses ACRLSD in cross-species segmentation on zebra finch data.

5

O-SNAP: A comprehensive pipeline for spatial profiling of chromatin architecture

Kim, H. H.; Martinez Sarmiento, J. A.; Palma, F. R.; Kant, A.; Zhang, E. Y.; Guo, Z.; Mauck, R.; Heo, S.-J.; Shenoy, V.; Bonini, M. G.; Lakadamyali, M.

2025-07-23 biophysics 10.1101/2025.07.18.665612 medRxiv

Top 0.1%

64.1%

Show abstract

We present O-SNAP (Objective Single-Molecule Nuclear Architecture Profiler), a comprehensive pipeline for the automated extraction, comparison, and classification of nuclear features from single-molecule localization microscopy (SMLM) data. O-SNAP quantifies 144 interpretable, biologically grounded spatial features describing chromatin organization or histone mark distributions at nanoscale resolution. The pipeline includes modules for pairwise comparison of features using volcano plots, feature set enrichment analysis, robust feature selection and classification of cell states, and pseudotime trajectory inference. We validate O-SNAP across diverse biological contexts, including fibroblast-to-stem cell reprogramming, tendon disease, histone variant sensitivity to oxidative stress, and chondrocyte de-differentiation, demonstrating its ability to detect subtle changes in nanoscale chromatin organization across diverse biological transitions.

6

Beyond Agreement: Standardizing Crowdsourced Synapse Annotations through Proofreading in EM Connectomics

Lee, S. Y.; Correia, A.; Ceffa, N.; Robbins, M.; Franco-Barranco, D.; Zlatic, M.; Cardona, A.; Mohinta, S.

2025-09-27 neuroscience 10.1101/2025.09.26.678851 medRxiv

Top 0.1%

63.8%

Show abstract

AO_SCPLOWBSTRACTC_SCPLOWReliable synapse identification in volumetric EM is hampered by subtle, 3D cues that yield variable human judgments. We present a standardized proofreading protocol that pairs explicit, operational criteria with machine-learning candidate generation and a two-stage calibration of annotators. In two larval Drosophila melanogaster volumes imaged at 8x8x8 nm, five raters (expert + 4 calibrated annotators) reviewed model-proposed candidates using efficient node-based labels. Multi-rater judgments were aggregated with a probabilistic Dawid-Skene (DS) model to produce consensus labels with calibrated uncertainty. Post-calibration, individual annotator accuracy versus the expert improved (McNemar p < 0.05 for all raters), DS-expert agreement increased, and DS posterior entropy decreased for true positives/negatives, indicating more decisive consensus; gains were modest and dataset-dependent in chance-corrected agreement (Krippendorffs ). By making uncertainty explicit, this protocol converts noisy judgments into auditable supervision suitable for training and evaluation, while honestly communicating residual ambiguity essential for reliable and robust connectomics at scale.

7

AlphaTracker: A Multi-Animal Tracking and Behavioral Analysis Tool

Chen, Z.; Zhang, R.; Zhang, Y. E.; Fang, H.-S.; Zhou, H.; Rock, R. R.; Bal, A.; Padilla-Coreano, N.; Keyes, L.; Tye, K. M.; Lu, C.

2020-12-06 animal behavior and cognition 10.1101/2020.12.04.405159 medRxiv

Top 0.1%

62.2%

Show abstract

The advancement of behavioral analysis in neuroscience has been aided by the development of computational tools1,2. Specifically, computer vision algorithms have emerged as a powerful tool to elevate behavioral research3,4. Yet fully automatic analysis of social behavior remains challenging in two ways. First, existing tools to track and analyze behavior often focus on single animals, not multiple, interacting animals. Second, many available tools are not developed for novice users and require programming experience to run. Here, we unveil a computer vision pipeline called AlphaTracker, which requires minimal hardware requirements and produces reliable tracking of multiple unmarked animals. An easy-to-use user interface further enables manual inspection and curation of results. We demonstrate the practical, real-time advantages of AlphaTracker through the study of multiple, socially-interacting mice.

8

OpenFISH enables integrated high-resolution spatial transcriptomics and metabolomics on a single tissue section

Li, X.; Huang, Y.; Wang, S.; Li, Y.; Jiang, F.; Gao, J.; Yang, Y.; Wu, Q.; Ge, W.-p.; Duan, L.

2025-08-22 molecular biology 10.1101/2025.08.19.671030 medRxiv

Top 0.1%

61.1%

Show abstract

Spatial transcriptomics enables in situ mapping of gene expression, yet no current platform provides single-cell, same-section integration with metabolomics, limiting direct links between transcriptional programsand metabolic phenotypesin native tissue. We present OpenFISH, a rapid, imaging-based spatial transcriptomics method operable on standard microscopes, requiring no proprietary hardware, and fully compatible with matrix-assisted laser desorption/ionization mass spectrometry imaging (MALDI-MSI). OpenFISH resolves hundreds of transcripts at subcellular resolution within 24 h and can be performed after MALDI-MSI, preserving metabolite distributions for cell-accurate co-registration on the same section. In mouse brain, integration with MALDI-MSI resolved metabolic heterogeneity at the level of individual cells. OpenFISH also quantified cell type-specific transcriptional activation of transposable elements after systemic lipopolysaccharide (LPS) challenge and detected disrupted spatial organization of D1 striatal neurons in Reeler mutants. Benchmarking showed performance comparable to or exceeding commercial platforms at [~]0.5% of per-sample cost. By enabling same-section, near-single-cell co-mapping of transcripts and metabolites in an accessible workflow, OpenFISH provides a scalable framework for high-content spatial multi-omics across neuroscience, immunology, cancer biology, and beyond.

9

AFM-Fold: Rapid Reconstruction of Protein Conformations from AFM Images

Kawai, T.; Matsunaga, Y.

2025-11-26 biophysics 10.1101/2025.11.17.688836 medRxiv

Top 0.1%

60.6%

Show abstract

High-speed atomic force microscopy (HS-AFM) enables direct visualization of protein dynamics under near-physiological conditions, yet its intrinsic limitation to surface topography prevents atomic-level structural characterization. We present AFM-Fold, a generative AI-based framework that reconstructs three-dimensional protein conformations directly from AFM images. AFM-Fold combines a rotation-equivariant convolutional neural network, which extracts low-dimensional collective variables (CVs) from AFM images, with a guided diffusion process that generates conformations consistent with the inferred CVs. Using pseudo-AFM images of adenylate kinase, AFM-Fold accurately reproduced not only the open and closed conformations, but also intermediate states. Application to 159 experimental HS-AFM frames of the flagellar protein FlhAC further demonstrated that AFM-Fold outperforms rigid-body fitting and captures time-correlated domain motions that reflect underlying conformational dynamics. AFM-Fold enables rapid, physically plausible structure estimation from individual AFM images, typically within one minute per frame, without relying on molecular dynamics simulations. This unified and computationally efficient pipeline opens a route to high-throughput structural analysis of HS-AFM movies.

10

LivecellX: A Deep-learning-based, Single-Cell Object-Oriented Framework for Quantitative Analysis in Live-Cell Imaging

Ni, K.; Yu, G.; Zheng, Z.; Lu, Y.; Poe, D.; Zhang, S.; Wang, Z.; Khurana, Y.; Lu, Y.; Chen, Y.; Zhou, S.; Sanborn, M.; Wang, W.; Xing, J.

2025-05-14 biophysics 10.1101/2025.02.23.639532 medRxiv

Top 0.1%

60.6%

Show abstract

Live-cell imaging uniquely captures single-cell dynamics in space and time, but robust analysis is limited by segmentation and tracking errors that accumulate across frames. We present LivecellX, a deep-learning-based pipeline that integrates instance-level segmentation error correction with trajectory refinement, leveraging temporal context to recover accurate cell tracks. LivecellX also introduces a benchmark dataset with detailed annotations of common error classes, providing a resource for method development and evaluation. Beyond error correction, the framework incorporates modules for classifying biological processes, reconstructing cell lineages, and analyzing dynamic behaviors. Users can interact with the system programmatically or through a Napari-based graphical interface, enabling flexible integration into diverse workflows. By coupling error-aware correction with comprehensive lineage and dynamics analysis, LivecellX establishes an open, extensible platform that advances the accuracy and scalability of live-cell imaging studies.

11

Segment Any Plant (SAP): Foundation-Model Segmentation for Plant Time-Series Phenotyping

Abbey, A.; Meroz, Y.

2026-03-13 biophysics 10.64898/2026.03.11.711099 medRxiv

Top 0.1%

60.0%

Show abstract

Quantitative studies of plant growth and environmental responses increasingly rely on time-series imaging, yet automated segmentation remains challenging due to continuous growth, large non-rigid morphological change, and frequent self-occlusion. Traditional image-processing pipelines and taskspecific deep learning models often require extensive annotated datasets and retraining, limiting portability across species, developmental stages, and imaging conditions. Here we present SAP (Segment Any Plant), a plant-focused framework that leverages the pretrained Segment Anything Model 2 (SAM2) to enable few-shot, training-free segmentation of plant timeseries imagery. SAP integrates interactive prompting, automated temporal mask propagation, and centerline extraction within a web-based interface, allowing users to move from raw images to quantitative descriptors of organ shape and dynamics without programming expertise. Across multiple systems, including Arabidopsis thaliana rosette development, root growth, sunflower gravitropism, and confocal root microscopy, SAP achieves high segmentation accuracy (mean IoU 0.890.93) and sub-pixel centerline precision from single-frame prompting. By reducing the need for task-specific retraining, SAP provides a transferable framework for reproducible time-series phenotyping across diverse experimental contexts.

12

EASE: EM-Assisted Source Extraction from calcium imaging data

Zhou, P.; Reimer, J.; Zhou, D.; Pasarkar, A.; Kinsella, I. A.; Froudarakis, E.; Yatsenko, D.; Fahey, P.; Bodor, A.; Buchanan, J.; Bumbarger, D. J.; Mahalingam, G.; Torres, R.; Dorkenwald, S.; Ih, D.; Lee, K.; Liu, R.; Macrina, T.; Silversmith, W.; Wu, J.; Wong, W.; Macarico da Costa, N.; Reid, R. C.; Tolias, A.; Paninski, L.

2020-03-25 neuroscience 10.1101/2020.03.25.007468 medRxiv

Top 0.1%

59.6%

Show abstract

Combining two-photon calcium imaging (2PCI) and electron microscopy (EM) provides arguably the most powerful current approach for connecting function to structure in neural circuits. Recent years have seen dramatic advances in obtaining and processing CI and EM data separately. In addition, several joint CI-EM datasets (with CI performed in vivo, followed by EM reconstruction of the same volume) have been collected. However, no automated analysis tools yet exist that can match each signal extracted from the CI data to a cell segment extracted from EM; previous efforts have been largely manual and focused on analyzing calcium activity in cell bodies, neglecting potentially rich functional information from axons and dendrites. There are two major roadblocks to solving this matching problem: first, dense EM reconstruction extracts orders of magnitude more segments than are visible in the corresponding CI field of view, and second, due to optical constraints and non-uniform brightness of the calcium indicator in each cell, direct matching of EM and CI spatial components is nontrivial. In this work we develop a pipeline for fusing CI and densely-reconstructed EM data. We model the observed CI data using a constrained nonnegative matrix factorization (CNMF) framework, in which segments extracted from the EM reconstruction serve to initialize and constrain the spatial components of the matrix factorization. We develop an efficient iterative procedure for solving the resulting combined matching and matrix factorization problem and apply this procedure to joint CI-EM data from mouse visual cortex. The method recovers hundreds of dendritic components from the CI data, visible across multiple functional scans at different depths, matched with densely-reconstructed three-dimensional neural segments recovered from the EM volume. We publicly release the output of this analysis as a new gold standard dataset that can be used to score algorithms for demixing signals from 2PCI data. Finally, we show that this database can be exploited to (1) learn a mapping from 3d EM segmentations to predict the corresponding 2d spatial components estimated from CI data, and (2) train a neural network to denoise these estimated spatial components. This neural network denoiser is a stand-alone module that can be dropped in to enhance any existing 2PCI analysis pipeline.

13

Global Neuron Shape Reasoning with Point Affinity Transformers

Troidl, J.; Knittel, J.; Li, W.; Zhan, F.; Pfister, H.; Turaga, S. C.

2025-03-11 neuroscience 10.1101/2024.11.24.625067 medRxiv

Top 0.1%

59.6%

Show abstract

Connectomics is a field of neuroscience that maps the brains intricate wiring diagram. Accurate neuron segmentation from microscopy volumes is essential for automating connectome reconstruction. However, state-of-the-art algorithms use image-based convolutional neural networks limited to local neuron shape context. Thus, we introduce a new framework that reasons over global neuron shape with a novel point affinity transformer. Our framework embeds a (multi-)neuron point cloud into a fixed-length feature set from which we can decode any point pair affinities, enabling clustering neuron point clouds for automatic proofreading. We also show that the learned feature set can easily be mapped to a contrastive embedding space that enables neuron type classification using a simple classifier. Our approach excels in two demanding connectomics tasks: correcting segmentation errors and classifying neuron types. Evaluated on three benchmark datasets derived from state-of-the-art connectomes, our method outperforms point transformers, graph neural networks, and unsupervised clustering baselines.

14

Reconstructing True 3D Spatial Omics at Single-Cell Resolution

Yang, Y.; Luo, Y.; Zhang, K.; Bu, Y.; Xia, Z.; Peng, H.; Yan, R.; Liu, Q.; Chen, Y.; Shen, L.; Chen, E.

2026-05-01 bioinformatics 10.64898/2026.04.28.721395 medRxiv

Top 0.1%

59.1%

Show abstract

Capturing the three-dimensional (3D) organization of cells is essential for deciphering complex biological processes, yet comprehensive 3D spatial omics is severely hindered by the destructive nature of physical sectioning and the depth limitations of intact tissue imaging. Current computational methods rely on 2.5D stacking of discrete slices, which inherently disrupts tissue topology and fails to resolve continuous depth-dependent molecular gradients. To bridge this gap, we introduce DO_SCPLOWEEPC_SCPLOWSO_SCPLOWPATIALC_SCPLOW, an Optimal Transport flow matching framework that models tissue evolution as a continuous dynamic vector field. By solving the underlying probability flow ODEs, DO_SCPLOWEEPC_SCPLOWSO_SCPLOWPATIALC_SCPLOW enables the direct extraction of uninterrupted, infinitely resolvable tissue states at arbitrary spatial depths. Using Deep STAR/RIBOmap 3D technologies, we demonstrate that DO_SCPLOWEEPC_SCPLOWSO_SCPLOWPATIALC_SCPLOW achieves improved 3D reconstruction fidelity relative to 2.5D approaches, yielding structures that more closely recapitulate native tissue microenvironments in real-world datasets. Across diverse spatial omics modalities, including spatial proteomics using imaging mass cytometry in human breast cancer and spatial transcriptomics using openST in head and neck squamous cell carcinoma metastatic lymph nodes, DO_SCPLOWEEPC_SCPLOWSO_SCPLOWPATIALC_SCPLOW produces biologically interpretable and high-fidelity reconstructions across datasets. We evaluated the scalability and robustness of DO_SCPLOWEEPC_SCPLOWSO_SCPLOWPATIALC_SCPLOW on a large-scale mouse brain dataset, reconstructing a continuous 3D cellular atlas comprising 39 million cells within 41.6 hours. Systematic downstream characterization validated its ability to recapitulate consistent spatial architectures, cell-type distributions, transcriptomic patterns, and microenvironmental structures across brain regions. Collectively, these results demonstrate DO_SCPLOWEEPC_SCPLOWSO_SCPLOWPATIALC_SCPLOW as a generalizable and efficient solution for true 3D spatial reconstruction across scales and modalities. O_FIG O_LINKSMALLFIG WIDTH=200 HEIGHT=117 SRC="FIGDIR/small/721395v2_ufig1.gif" ALT="Figure 1"> View larger version (49K): org.highwire.dtl.DTLVardef@1a19624org.highwire.dtl.DTLVardef@188361forg.highwire.dtl.DTLVardef@199321corg.highwire.dtl.DTLVardef@a8f411_HPS_FORMAT_FIGEXP M_FIG C_FIG

15

DeepSRFusion: a point cloud deep learning framework for super-resolution particle fusion

Qiao, Y.; Wang, J.; Xi, J.; Ding, J.; Chen, T.; Zhang, Y.; Qiu, L.; Zhao, W.; Liu, J.; Xu, F.

2026-03-01 biophysics 10.64898/2026.02.26.708405 medRxiv

Top 0.1%

58.2%

Show abstract

Deciphering the spatial organization of macromolecular complexes in their native context is central to structural biology. Particle fusion in single-molecule localization microscopy offers a unique capability for high-resolution structural reconstruction in situ. However, existing methods face significant challenges from large rotational perturbations and sparse labeling, resulting in compromised accuracy and substantial computational cost. We present DeepSRFusion, a self-supervised pretraining framework for three-dimensional super-resolution particle fusion. By representing single-molecule point clouds as Gaussian Mixture Models, DeepSRFusion integrates data-driven feature learning with physical imaging constraints. A two-stage optimization strategy with dynamic template updating enhances robustness, and a novel Clustering Error metric quantifies fusion quality. Nanometer-scale validation on both simulated and experimental datasets demonstrates high reconstruction fidelity and structural consistency with cryo-electron microscopy and AlphaFold3. DeepSRFusion remains effective under challenging imaging conditions, including large 3D rotations, sparse labeling, high localization uncertainty, and limited particle numbers, while achieving over 100-fold speedups compared to current methods. It resolves fine structural features with a measured spatial resolution of 1.60 {+/-} 0.10 nm, sufficient to distinguish ~10 nm spaced protein pairs and visualize tilted internal substructures within macromolecular assemblies. DeepSRFusion provides a powerful tool for high-precision structural analysis in native cellular environments.

16

BeadBuddy: user-friendly, nanometer-scale registration of single-molecule imaging data

Lionnet, T.; Clark, F. T.; Whitney, P. H.; Saiz, N.; Ziarno, A.

2025-11-29 biophysics 10.1101/2025.11.25.690467 medRxiv

Top 0.1%

55.0%

Show abstract

Single-molecule localization microscopy (SMLM) captures nanoscale detail with fluorescence imaging. As large-area cameras and multiplex imaging become standard, chromatic errors are growing more complex, challenging precise analysis. To address this challenge, we present BeadBuddy, an open-source, user-friendly software that uses images of fluorescent beads to model and correct 3D, spatially varying chromatic errors. BeadBuddy achieves sub-voxel resolution in DNA Fluorescence In Situ Hybridization (FISH) and is applicable across SMLM modalities.

17

Escalating High-dimensional Imaging using Combinatorial Channel Multiplexing and Deep Learning

Ben-Uri, R.; Ben Shabat, L.; Bar-Tal, O.; Bussi, Y.; Maimon, N.; Keidar Haran, T.; Milo, I.; Elhanani, O.; Rochwarger, A.; Schürch, C. M.; Bagon, S.; Keren, L.

2023-09-12 systems biology 10.1101/2023.09.09.556962 medRxiv

Top 0.1%

54.9%

Show abstract

Understanding tissue structure and function requires tools that quantify the expression of multiple proteins at single-cell resolution while preserving spatial information. Current imaging technologies use a separate channel for each individual protein, inherently limiting their throughput and scalability. Here, we present CombPlex (COMBinatorial multiPLEXing), a combinatorial staining platform coupled with an algorithmic framework to exponentially increase the number of proteins that can be measured from C up to 2c - 1. In CombPlex, every protein can be imaged in several channels, and every channel contains agglomerated images of several proteins. These combinatorically-compressed images are then decompressed to individual protein-images using deep learning. We achieve accurate reconstruction when compressing the stains of twenty-two proteins to five imaging channels and demonstrate that the approach works in both fluorescence microscopy and in mass-based imaging. Combinatorial staining coupled with deep-learning decompression can escalate the number of proteins measured using any imaging modality, without the need for specialized instrumentation. Coupling CombPlex with instruments for high-dimensional imaging could pave the way to image hundreds of proteins at single-cell resolution in intact tissue sections.

18

Foundation model for efficient biological discovery in single-molecule data

Li, J.; Zhang, L.; Johnson-Buck, A.; Walter, N. G.

2024-08-27 biophysics 10.1101/2024.08.26.609721 medRxiv

Top 0.1%

54.7%

Show abstract

Modern data-intensive techniques offer ever deeper insights into biology, but render the process of discovery increasingly complex. For example, exploiting the unique ability of single-molecule fluorescence microscopy (SMFM)1-5. to uncover rare but critical intermediates often demands manual inspection of time traces and iterative ad hoc approaches that are difficult to systematize. To facilitate systematic and efficient discovery from SMFM data, we introduce META-SiM, a transformer-based foundation model pre-trained on diverse SMFM analysis tasks. META-SiM achieves high performance--rivaling best-in-class algorithms--on a broad range of analysis tasks including trace selection, classification, segmentation, idealization, and stepwise photobleaching analysis. Additionally, the model produces high-dimensional embedding vectors that encapsulate detailed information about each trace, which the web-based META-SiM Projector (https://www.simol-projector.org) casts into lower-dimensional space for efficient whole-dataset visualization, labeling, comparison, and sharing. Combining this Projector with the objective metric of Local Shannon Entropy enables rapid identification of condition-specific behaviors, even if rare or subtle. As a result, by applying META-SiM to an existing single-molecule Forster resonance energy transfer (smFRET) dataset6, we discover a previously unobserved intermediate state in pre-mRNA splicing. META-SiM thus removes bottlenecks, improves objectivity, and both systematizes and accelerates biological discovery in complex single-molecule data.

19

LLM-autonomous development of deep learning models for quantitative microscopy

Zhou, X.; Wang, S.

2026-04-08 bioengineering 10.64898/2026.04.03.716415 medRxiv

Top 0.1%

54.6%

Show abstract

Deep learning can extract quantitative measurements from microscopy images that are inaccessible to classical analysis, but developing these models requires machine learning expertise that most imaging scientists do not have. Here we present a framework in which a researcher describes their microscopy problem to a large language model (LLM) agent in under ten minutes of conversation--specifying what they image, what they want to measure, and what success looks like--and the agent autonomously handles the rest: designing physics-based training data, implementing a neural network, training, diagnosing failures, and iterating without human intervention. A researcher can start the agent before leaving the lab; overnight, it tests tens to a hundred model variations, each one an experiment that would otherwise demand active attention. We validate the framework across six microscopy modalities and four problem types. On the BBBC039 nuclear segmentation benchmark, the agent autonomously trains a U-Net with 3-class semantic segmentation and morphological post-processing, achieving pixel-level Dice of 0.97 and object-level F1 of 0.84--within 7% of the published baseline--while diagnosing a data pipeline bug that no amount of hyperparameter tuning could resolve. On single-protein holographic microscopy, the agent reads a published paper, designs a simulator, and develops an optimized model in a single session. On PatchCamelyon histopathology classification, the agent autonomously evolves through four optimization phases--from scratch training through transfer learning and regularization to inference-time ensembling--completing 97 iterations on 262,144 images to reach 89.3% test accuracy and 96.3% AUC, nearly matching the published rotation-equivariant baseline. This framework enables microscopy researchers to use deep learning-based image analysis without machine learning domain knowledge.

20

MDMR: Balancing Diversity and Redundancy for Annotation-Efficient Fine-Tuning of Pretrained Cell Segmentation Models

Sheikh, E. M.; Tharwat, A.; Schwan, C.; Schenck, W.

2025-11-05 bioengineering 10.1101/2025.11.04.686267 medRxiv

Top 0.1%

54.6%

Show abstract

Pretrained cell segmentation models have simplified and accelerated microscopy image analysis, but they often perform poorly on new and challenging datasets. Although these models can be adapted to new datasets with only a few annotated images, the effectiveness of fine-tuning depends critically on which images are selected for annotation. To address this, we propose CGMD (Centrality-Guided Maximum Diversity), a novel algorithm that identifies a small set of images that are maximally diverse with respect to each other in the pretrained feature space. We evaluate CGMD under an extremely low annotation budget of just two images per dataset for fine-tuning the pretrained Cellpose Cyto2 model on four different 2D+t datasets from the Cell Tracking Challenge. CGMD consistently outperforms six competitive active learning and subset selection methods and approaches the performance of fully supervised fine-tuning. The results show that centrality-guided maximum diversity subset selection enables stable and annotation-efficient fine-tuning of pretrained cell segmentation models. The code is publicly available at: https://github.com/eiram-mahera/cgmd.